Add Smallest AI STT and TTS service integrations #4014
Draft
markbackman wants to merge 6 commits into main
- STT: Update model from lightning to pulse with new API URL
- STT: Add SmallestRealtimeSTTService using Pulse WebSocket API for low-latency streaming transcription
- TTS: Add lightning-v3.1 model and set as default
- stt_latency: Add SMALLEST_TTFS_P99 constant
Made-with: Cursor
Migrate STT/TTS services from deprecated set_model_name()/set_voice() to the
new ServiceSettings pattern (STTSettings/TTSSettings). Add default voice_id
("sophia") for TTS services, fix voice references, and include two foundational
example scripts showing WebSocket and HTTP usage.
Made-with: Cursor
@harshitajain165 I started from your branch and focused on the services best suited for real-time use, that is, the WebSocket services. The TTS service is working well and would be ready for production use. I'm struggling with the STT service: the transcriptions are usually missing words, and I get hallucinations that appear ~3 seconds after the final transcript arrives. Have you seen this with other implementations, and do you have any other recommendations? As it stands, the STT service is not fit for production, so I'd like to iron out the issues before merging this.
Summary
Adds Smallest AI WebSocket-based STT and TTS services, aligned with current Pipecat conventions.
Based on and supersedes #3897 by @harshitajain165. Key changes from the original PR:
Test plan